14 research outputs found

    Visual Analytics for the Exploratory Analysis and Labeling of Cultural Data

    Cultural data can come in various forms and modalities, such as text traditions, artworks, music, and crafted objects, or as intangible heritage such as biographies of people, performing arts, cultural customs, and rites. The assignment of metadata to such cultural heritage objects is an important task that people working in galleries, libraries, archives, and museums (GLAM) do on a daily basis. These rich metadata collections are used to categorize, structure, and study collections, but they can also serve as input for computational methods. Such computational methods are the focus of Computational and Digital Humanities projects and research. For the longest time, the digital humanities community has focused on textual corpora, including text mining and other natural language processing techniques, although some disciplines of the humanities, such as art history and archaeology, have a long history of using visualizations. In recent years, the digital humanities community has started to shift the focus to include other modalities, such as audio-visual data. In turn, methods in machine learning and computer vision have been proposed for the specificities of such corpora. Over the last decade, the visualization community has engaged in several collaborations with the digital humanities, often with a focus on exploratory or comparative analysis of the data at hand. This includes methods and systems that support classical Close Reading of the material, Distant Reading methods that give an overview of larger collections, and methods in between, such as Meso Reading. Furthermore, a wider application of machine learning methods can be observed on cultural heritage collections, but these methods are rarely applied together with visualizations to allow for further perspectives on the collections in a visual analytics or human-in-the-loop setting. Visual analytics can help in the decision-making process by guiding domain experts through the collection of interest.
However, state-of-the-art supervised machine learning methods are often not applicable to the collection of interest due to missing ground truth. One form of ground truth is class labels, e.g., of entities depicted in an image collection, assigned to the individual images. Labeling all objects in a collection is an arduous task when performed manually, because cultural heritage collections contain a wide variety of different objects with plenty of details. A problem that arises when collections are curated in different institutions is that a specific standard is not always followed, so the vocabularies used can drift apart from one another, making it difficult to combine the data from these institutions for large-scale analysis. This thesis presents a series of projects that combine machine learning methods with interactive visualizations for the exploratory analysis and labeling of cultural data. First, we define cultural data with regard to heritage and contemporary data; then we look at the state of the art of existing visualization, computer vision, and visual analytics methods and projects focusing on cultural data collections. After this, we present the problems addressed in this thesis and their solutions, starting with a series of visualizations to explore different facets of rap lyrics and rap artists with a focus on text reuse. Next, we engage in a more complex case of text reuse: the collation of medieval vernacular text editions. For this, a human-in-the-loop process is presented that applies word embeddings and interactive visualizations to perform textual alignments on under-resourced languages, supported by labeling of the relations between lines and the relations between words. We then switch the focus from textual data to another modality of cultural data by presenting a Virtual Museum that combines interactive visualizations and computer vision in order to explore a collection of artworks.
With the lessons learned from the previous projects, we engage in the labeling and analysis of medieval illuminated manuscripts and thereby combine some of the machine learning methods and visualizations that were used for textual data with computer vision methods. Finally, we reflect on the interdisciplinary projects and the lessons learned, before we discuss existing challenges when working with cultural heritage data from the computer science perspective to outline potential research directions for machine learning and visual analytics of cultural heritage data.

    Visual Analysis of Engineers' Biographies and Engineering Branches

    The Prosopographic Database of German Engineers 1825–1970 contains a wealth of biographical information. Given a set of research interests provided by collaborating historians, this paper discusses the steps undertaken (1) to extract engineering subjects from unstructured text entries in the database, accompanied by geospatial and temporal information, (2) to adapt existing visual representations to facilitate exploratory analyses, and (3) to design a visual interface that supports the interactive composition of engineering branches from engineering subjects, enabling the comparative analysis of geospatial-temporal developments in engineering. Usage scenarios outline the benefit of the proposed visualizations for modern prosopography research.

    Explorative Visual Analysis of Rap Music

    Detecting references and similarities in music lyrics can be a difficult task. Crowdsourced knowledge platforms such as Genius can help in this process through user-annotated information about the artist and the song, but they fail to include visualizations that help users find similarities and structures on a higher and more abstract level. We propose a prototype to compute similarities between rap artists based on word embeddings of their lyrics crawled from Genius. Furthermore, the artists and their lyrics can be analyzed using an explorative visualization system that applies multiple visualization methods to support domain-specific tasks.

    Visualizing Similarities between American Rap-Artists based on Text Reuse

    Rap music is one of the biggest music genres in the world today. Since the early days of rap music, references not only to pop culture but also to other rap artists have been an integral part of the lyrics’ artistry. Rappers may use them to introduce their shared personal backgrounds, such as where they grew up. In addition, rap musicians reference each other by adopting fragments of lyrics, for example, to give credit. This kind of text reuse can be used to create connections between individual artists. Given the large volume of lyrics, only automated methods can detect text reuse efficiently. Automated methods can also be used to identify similar artists based on their lyrical content. Here, we present a visualization system for analyzing text reuse in rap music lyrics. The system supports the user in detecting text reuse and allusions between songs and in exploring connections between artists. For this purpose, we crawled the song lyrics and metadata of selected American rap artists from Genius.com. We also trained a network tailored specifically for rap lyrics, which we named “rapBERTa”, to compute similarities in lyrics.

    n Walks in the Fictional Woods

    This paper presents a novel exploration of the interaction between generative AI models, visualization, and narrative generation processes, using OpenAI's GPT as a case study. Drawing on Umberto Eco's "Six Walks in the Fictional Woods", we engender a speculative, transdisciplinary scientific narrative plentiful with references and links to relevant talks. To enrich our exposition, we present a visualization prototype to analyze storyboarded narratives, and extensive conversations with ChatGPT. Our paper is thoroughly decorated with thoughtful decorations that try to encode meaning and complement the narrative. Comment: this is a submission for alt.vis 202

    Automated Alignment of Medieval Text Versions based on Word Embeddings

    Medieval textuality is characterized by instability in text structure and length that varies according to the text tradition. This instability across versions, otherwise known as “mouvance”, manifests in dialectal differences, traces of orality, the modification of wording, and even the rewriting and rearrangement of large parts of the text. To help humanities scholars in the exploratory analysis of such complex text collections, the visual analytics system iteal was initially proposed. The system aligns similar phrases at the line level on the basis of string similarity and word n-grams. We propose an extension of this system that replaces the parameter-based approach with an automatic one using word embeddings, thereby adding a semantic component. The benefit of the new visualization system is shown through a comparison of different versions of medieval French texts. Additionally, a domain expert compared the parameter-based approach with the approach based on word embeddings to outline the similarities and differences in the alignments.
